Playout policy adaptation with move features
Abstract
Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose learning a playout policy online to improve MCTS for GGP. We also propose learning the policy not only from the moves themselves but also from features of the moves. We test the resulting algorithms, Playout Policy Adaptation (PPA) and Playout Policy Adaptation with move Features (PPAF), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Knightthrough, Misere Knightthrough and Nogo. The experiments compare PPA and PPAF to Upper Confidence bounds applied to Trees (UCT) and to the closely related Move-Average Sampling Technique (MAST) algorithm.
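The abstract does not spell out how the playout policy is learned, so the following is only a minimal sketch of one plausible form of online adaptation. It assumes a softmax playout policy over per-move weights and a gradient-style update applied to the winner's moves. The names `choose_move` and `adapt`, the step size `alpha`, and the `(move, feature)` key format are all hypothetical and are not taken from the paper.

```python
import math
import random
from collections import defaultdict


def choose_move(weights, legal_moves, rng):
    """Sample a move with probability proportional to exp(weight).

    `weights` is a defaultdict(float), so unseen moves start at weight 0.
    """
    exps = [math.exp(weights[m]) for m in legal_moves]
    return rng.choices(legal_moves, weights=exps, k=1)[0]


def adapt(weights, winner_steps, alpha=1.0):
    """Return new weights that reinforce the moves the winner played.

    `winner_steps` is a list of (legal_moves, played_move) pairs, one per
    position in which the winner of the playout had to move (assumed format).
    Probabilities are computed from the old weights, so the update does not
    depend on the order of the steps.
    """
    updated = defaultdict(float, weights)
    for legal_moves, played in winner_steps:
        total = sum(math.exp(weights[m]) for m in legal_moves)
        updated[played] += alpha
        for m in legal_moves:
            # Subtract each move's current selection probability.
            updated[m] -= alpha * math.exp(weights[m]) / total
    return updated


# Keys can be plain moves, or (move, feature) pairs to mimic learning on
# move features; the "capture" / "quiet" labels are illustrative only.
weights = defaultdict(float)
legal = [("a2-b3", "capture"), ("a2-a3", "quiet")]
weights = adapt(weights, [(legal, legal[0])], alpha=1.0)
next_move = choose_move(weights, legal, random.Random(0))
```

Under these assumptions, keying the weights by a move alone corresponds to the PPA idea, while keying them by a move together with a feature of that move corresponds to the PPAF idea described in the abstract.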
Similar resources
Memorizing the Playout Policy
Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). Playout Policy Adaptation with move Features (PPAF) is a state-of-the-art MCTS algorithm that learns a playout policy online. We propose a simple modification to PPAF that memorizes the learned policy from one move to the next. We test PPAF with memorization (PPAFM) against PPAF and UCT fo...
Playout Policy Adaptation for Games
Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose learning a playout policy online to improve MCTS for GGP. We test the resulting algorithm, Playout Policy Adaptation (PPA), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Go, Knightthrough, Misere Knightthrough, Nogo and Misere Nogo. For most of ...
Optimization of a packet video receiver under different levels of delay jitter: an analytical approach
This paper studies the problem of analyzing and designing optimal playout adaptation policies for packet video receivers (PVRs) that operate in a delay jitter inducing best-effort network, like the current Internet. The developed system model is built around the Ek/Di/1/N phase-type queue and allows for the effective modeling of key design and system parameters, such as: the level of delay jitt...
Nested Rollout Policy Adaptation with Selective Policies
Monte Carlo Tree Search (MCTS) is a general search algorithm that has improved the state of the art for multiple games and optimization problems. Nested Rollout Policy Adaptation (NRPA) is an MCTS variant that has found record-breaking solutions for puzzles and optimization problems. It learns a playout policy online that dynamically adapts the playouts to the problem at hand. We propose to enh...
Joint Power/Playout Control Schemes for Media Streaming over Wireless Links
We investigate transmission and playout policies for streaming media over a wireless link. In particular, we choose both the power at the transmitter and the playout rate at the receiver, in order to minimize the power consumption and maximize the media quality. We formulate the problem using a dynamic programming approach, study the structural properties of the optimal solution, develop justif...
Journal title:
- Theor. Comput. Sci.
Volume 644
Publication date: 2016